Dataset statistics
| Number of variables | 23 |
|---|---|
| Number of observations | 35353 |
| Missing cells | 327813 |
| Missing cells (%) | 40.3% |
| Total size in memory | 6.2 MiB |
| Average record size in memory | 184.0 B |
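The derived figures in this table can be cross-checked from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is missing cells divided by variables times observations, and that the average record size is total memory divided by observations. The variable names are illustrative and do not come from the report.

```python
# Cross-check the derived dataset statistics from the raw counts.
# Assumption: "Missing cells (%)" = missing cells / (variables * observations).
n_variables = 23
n_observations = 35353
missing_cells = 327813

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / observations. The total is
# taken from the rounded 6.2 MiB figure, so the result is only approximate.
total_bytes = 6.2 * 1024 * 1024
avg_record_bytes = total_bytes / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

Under these assumptions the recomputed values agree with the table to the precision shown.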
Variable types
| Text | 13 |
|---|---|
| Unsupported | 8 |
| Numeric | 2 |
Alerts
| Alert | Type |
|---|---|
| Binding has constant value "1" | Constant |
| TRA_leader has 638 (1.8%) missing values | Missing |
| TRB_leader has 1063 (3.0%) missing values | Missing |
| Linker has 35353 (100.0%) missing values | Missing |
| Link_order has 35353 (100.0%) missing values | Missing |
| TRA_5_prime_seq has 35353 (100.0%) missing values | Missing |
| TRA_3_prime_seq has 35353 (100.0%) missing values | Missing |
| TRB_5_prime_seq has 35353 (100.0%) missing values | Missing |
| TRB_3_prime_seq has 35353 (100.0%) missing values | Missing |
| Linked_nt has 35353 (100.0%) missing values | Missing |
| Linked_aa has 35353 (100.0%) missing values | Missing |
| Score has 8583 (24.3%) missing values | Missing |
| MHC A has 1984 (5.6%) missing values | Missing |
| MHC B has 32713 (92.5%) missing values | Missing |
| Linker is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| Link_order is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| TRA_5_prime_seq is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| TRA_3_prime_seq is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| TRB_5_prime_seq is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| TRB_3_prime_seq is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| Linked_nt is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| Linked_aa is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| Score has 25266 (71.5%) zeros | Zeros |
Reproduction
| Analysis started | 2024-04-10 13:01:27.032422 |
|---|---|
| Analysis finished | 2024-04-10 13:01:30.384468 |
| Duration | 3.35 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
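The reported duration follows from the two timestamps. As a small sanity check, the sketch below parses them with the standard library and takes the difference; the format string and variable names are assumptions made for this illustration.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-04-10 13:01:27.032422", FMT)
finished = datetime.strptime("2024-04-10 13:01:30.384468", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```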
TRAV
Text
| Distinct | 106 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 15 |
|---|---|
| Median length | 13 |
| Mean length | 10.14035584 |
| Min length | 5 |
Characters and Unicode
| Total characters | 358492 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 12 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | TRAV26-1*01 |
|---|---|
| 2nd row | TRAV20*01 |
| 3rd row | TRAV38-2/DV8*01 |
| 4th row | TRAV26-1*01 |
| 5th row | TRAV20*01 |
| Value | Count | Frequency (%) |
| trav12-2*01 | 2641 | 7.5% |
| trav19*01 | 1815 | 5.1% |
| trav12-1*01 | 1751 | 5.0% |
| trav14/dv4*01 | 1570 | 4.4% |
| trav1-2*01 | 1539 | 4.4% |
| trav13-1*01 | 1509 | 4.3% |
| trav21*01 | 1492 | 4.2% |
| trav29/dv5*01 | 1490 | 4.2% |
| trav35*01 | 1457 | 4.1% |
| trav17*01 | 1427 | 4.0% |
| Other values (95) | 18662 | 52.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 58051 | 16.2% |
| V | 39893 | 11.1% |
| 0 | 36212 | 10.1% |
| T | 35353 | 9.9% |
| A | 35353 | 9.9% |
| R | 35353 | 9.9% |
| * | 34804 | 9.7% |
| 2 | 22095 | 6.2% |
| - | 16539 | 4.6% |
| 3 | 9704 | 2.7% |
| Other values (9) | 35135 | 9.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 358492 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 358492 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 358492 |
TRAJ
Text
| Distinct | 104 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.844991938 |
| Min length | 5 |
Characters and Unicode
| Total characters | 312697 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | TRAJ43*01 |
|---|---|
| 2nd row | TRAJ28*01 |
| 3rd row | TRAJ40*01 |
| 4th row | TRAJ43*01 |
| 5th row | TRAJ28*01 |
| Value | Count | Frequency (%) |
| traj42*01 | 2819 | 8.0% |
| traj43*01 | 1241 | 3.5% |
| traj45*01 | 1192 | 3.4% |
| traj20*01 | 1160 | 3.3% |
| traj49*01 | 1158 | 3.3% |
| traj52*01 | 1156 | 3.3% |
| traj33*01 | 1154 | 3.3% |
| traj40*01 | 1101 | 3.1% |
| traj30*01 | 1066 | 3.0% |
| traj34*01 | 1032 | 2.9% |
| Other values (94) | 22274 | 63.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 40997 | 13.1% |
| 0 | 39334 | 12.6% |
| T | 35353 | 11.3% |
| R | 35353 | 11.3% |
| A | 35353 | 11.3% |
| J | 35353 | 11.3% |
| * | 34800 | 11.1% |
| 4 | 13687 | 4.4% |
| 3 | 12370 | 4.0% |
| 2 | 12165 | 3.9% |
| Other values (5) | 17932 | 5.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 312697 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 312697 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 312697 |
TRA_CDR3
Text
| Distinct | 21940 |
|---|---|
| Distinct (%) | 62.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 25 |
| Mean length | 13.58422199 |
| Min length | 4 |
Characters and Unicode
| Total characters | 480243 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 15593 |
|---|---|
| Unique (%) | 44.1% |
Sample
| 1st row | CIVRAPGRADMRF |
|---|---|
| 2nd row | CAVPSGAGSYQLTF |
| 3rd row | CAYRPPGTYKYIF |
| 4th row | CIVRAPGRADMRF |
| 5th row | CAVPSGAGSYQLTF |
| Value | Count | Frequency (%) |
| caglnyggsqgnlif | 188 | 0.5% |
| cagqnyggsqgnlif | 172 | 0.5% |
| caigpgnmltf | 151 | 0.4% |
| caasetsydkvif | 150 | 0.4% |
| cagmnyggsqgnlif | 134 | 0.4% |
| cavdlmktsydkvif | 124 | 0.4% |
| cagggsqgnlif | 107 | 0.3% |
| cadsgggadgltf | 103 | 0.3% |
| camrrpissgsarqltf | 102 | 0.3% |
| cavrdsnyqliw | 64 | 0.2% |
| Other values (21930) | 34058 | 96.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| G | 59175 | 12.3% |
| A | 51661 | 10.8% |
| F | 41942 | 8.7% |
| L | 36078 | 7.5% |
| C | 34974 | 7.3% |
| S | 33230 | 6.9% |
| N | 30943 | 6.4% |
| T | 28588 | 6.0% |
| V | 24667 | 5.1% |
| K | 21524 | 4.5% |
| Other values (14) | 117461 | 24.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 480243 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 480243 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 480243 |
TRBV
Text
| Distinct | 124 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 15 |
|---|---|
| Median length | 12 |
| Mean length | 9.701835771 |
| Min length | 5 |
Characters and Unicode
| Total characters | 342989 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 16 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | TRBV13*01 |
|---|---|
| 2nd row | TRBV13*01 |
| 3rd row | TRBV14*01 |
| 4th row | TRBV13*01 |
| 5th row | TRBV13*01 |
| Value | Count | Frequency (%) |
| trbv19*01 | 3545 | 10.0% |
| trbv20-1*01 | 2220 | 6.3% |
| trbv27*01 | 2174 | 6.1% |
| trbv7-9*01 | 2081 | 5.9% |
| trbv9*01 | 1660 | 4.7% |
| trbv4-1*01 | 1266 | 3.6% |
| trbv2*01 | 1198 | 3.4% |
| trbv6-5*01 | 1143 | 3.2% |
| trbv5-1*01 | 1022 | 2.9% |
| trbv28*01 | 1010 | 2.9% |
| Other values (112) | 18034 | 51.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 53964 | 15.7% |
| 0 | 38053 | 11.1% |
| R | 35354 | 10.3% |
| T | 35353 | 10.3% |
| B | 35353 | 10.3% |
| V | 35353 | 10.3% |
| * | 34302 | 10.0% |
| - | 22764 | 6.6% |
| 2 | 13769 | 4.0% |
| 9 | 8736 | 2.5% |
| Other values (9) | 29988 | 8.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 342989 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 342989 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 342989 |
TRBJ
Text
| Distinct | 27 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.910898651 |
| Min length | 7 |
Characters and Unicode
| Total characters | 350380 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | TRBJ1-5*01 |
|---|---|
| 2nd row | TRBJ1-5*01 |
| 3rd row | TRBJ2-1*01 |
| 4th row | TRBJ1-5*01 |
| 5th row | TRBJ1-5*01 |
| Value | Count | Frequency (%) |
| trbj2-7*01 | 5812 | 16.4% |
| trbj2-1*01 | 5738 | 16.2% |
| trbj1-2*01 | 4144 | 11.7% |
| trbj1-1*01 | 4078 | 11.5% |
| trbj2-3*01 | 3611 | 10.2% |
| trbj2-2*01 | 3250 | 9.2% |
| trbj2-5*01 | 2313 | 6.5% |
| trbj1-5*01 | 1952 | 5.5% |
| trbj1-4*01 | 1048 | 3.0% |
| trbj1-6*01 | 824 | 2.3% |
| Other values (17) | 2583 | 7.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 57240 | 16.3% |
| T | 35353 | 10.1% |
| R | 35353 | 10.1% |
| B | 35353 | 10.1% |
| J | 35353 | 10.1% |
| - | 35353 | 10.1% |
| * | 34303 | 9.8% |
| 0 | 34303 | 9.8% |
| 2 | 29831 | 8.5% |
| 7 | 6034 | 1.7% |
| Other values (4) | 11904 | 3.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 350380 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 350380 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 350380 |
TRB_CDR3
Text
| Distinct | 24154 |
|---|---|
| Distinct (%) | 68.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 26 |
|---|---|
| Median length | 24 |
| Mean length | 14.40797103 |
| Min length | 6 |
Characters and Unicode
| Total characters | 509365 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 17895 |
|---|---|
| Unique (%) | 50.6% |
Sample
| 1st row | CASSYLPGQGDHYSNQPQHF |
|---|---|
| 2nd row | CASSFEPGQGFYSNQPQHF |
| 3rd row | CASSALASLNEQFF |
| 4th row | CASSYLPGQGDHYSNQPQHF |
| 5th row | CASSFEPGQGFYSNQPQHF |
| Value | Count | Frequency (%) |
| cassirssyeqyf | 272 | 0.8% |
| casswgggshygytf | 145 | 0.4% |
| cassfsgntgelff | 136 | 0.4% |
| casslrdgseaff | 97 | 0.3% |
| cassirsayeqyf | 78 | 0.2% |
| csvdleanygytf | 74 | 0.2% |
| cassvrssyeqyf | 46 | 0.1% |
| cassygaggyneqff | 45 | 0.1% |
| casrtglastdtqyf | 40 | 0.1% |
| cassqdhrmggheklff | 39 | 0.1% |
| Other values (24144) | 34381 | 97.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 81256 | 16.0% |
| A | 52620 | 10.3% |
| F | 52220 | 10.3% |
| G | 51974 | 10.2% |
| C | 34073 | 6.7% |
| T | 33563 | 6.6% |
| Q | 31804 | 6.2% |
| Y | 31737 | 6.2% |
| E | 29020 | 5.7% |
| L | 21464 | 4.2% |
| Other values (11) | 89634 | 17.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 509365 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 509365 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 509365 |
TRA_leader
Text
MISSING 
| Distinct | 51 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 638 |
| Missing (%) | 1.8% |
| Memory size | 276.3 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 16 |
| Mean length | 13.18479044 |
| Min length | 11 |
Characters and Unicode
| Total characters | 457710 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | TRAV26-1*01(L) |
|---|---|
| 2nd row | TRAV20*01(L) |
| 3rd row | TRAV38-2/DV8*01(L) |
| 4th row | TRAV26-1*01(L) |
| 5th row | TRAV20*01(L) |
| Value | Count | Frequency (%) |
| trav12-2*01(l | 2643 | 7.6% |
| trav19*01(l | 1815 | 5.2% |
| trav12-1*01(l | 1748 | 5.0% |
| trav13-1*01(l | 1590 | 4.6% |
| trav14/dv4*01(l | 1572 | 4.5% |
| trav1-2*01(l | 1537 | 4.4% |
| trav21*01(l | 1492 | 4.3% |
| trav29/dv5*01(l | 1484 | 4.3% |
| trav35*01(l | 1456 | 4.2% |
| trav17*01(l | 1407 | 4.1% |
| Other values (41) | 17971 | 51.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 57658 | 12.6% |
| V | 39186 | 8.6% |
| 0 | 36107 | 7.9% |
| T | 34715 | 7.6% |
| ) | 34715 | 7.6% |
| R | 34715 | 7.6% |
| ( | 34715 | 7.6% |
| L | 34715 | 7.6% |
| * | 34715 | 7.6% |
| A | 34715 | 7.6% |
| Other values (11) | 81754 | 17.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 457710 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 457710 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 457710 |
TRB_leader
Text
MISSING 
| Distinct | 69 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 1063 |
| Missing (%) | 3.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 14 |
| Mean length | 12.7951881 |
| Min length | 11 |
Characters and Unicode
| Total characters | 438747 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 7 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | TRBV13*01(L) |
|---|---|
| 2nd row | TRBV13*01(L) |
| 3rd row | TRBV14*01(L) |
| 4th row | TRBV13*01(L) |
| 5th row | TRBV13*01(L) |
| Value | Count | Frequency (%) |
| trbv19*01(l | 3546 | 10.3% |
| trbv20-1*01(l | 2218 | 6.5% |
| trbv27*01(l | 2172 | 6.3% |
| trbv7-9*01(l | 2081 | 6.1% |
| trbv9*01(l | 1660 | 4.8% |
| trbv4-1*01(l | 1266 | 3.7% |
| trbv2*01(l | 1198 | 3.5% |
| trbv6-5*01(l | 1143 | 3.3% |
| trbv5-1*01(l | 1019 | 3.0% |
| trbv28*01(l | 1009 | 2.9% |
| Other values (59) | 16978 | 49.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 53494 | 12.2% |
| 0 | 37922 | 8.6% |
| R | 34291 | 7.8% |
| T | 34290 | 7.8% |
| L | 34290 | 7.8% |
| ( | 34290 | 7.8% |
| ) | 34290 | 7.8% |
| * | 34290 | 7.8% |
| V | 34290 | 7.8% |
| B | 34290 | 7.8% |
| Other values (11) | 73010 | 16.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 438747 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 438747 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 438747 |
Linker
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 35353 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 276.3 KiB |
Link_order
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 35353 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 276.3 KiB |
TRA_5_prime_seq
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 35353 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 276.3 KiB |
TRA_3_prime_seq
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 35353 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 276.3 KiB |
TRB_5_prime_seq
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 35353 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 276.3 KiB |
TRB_3_prime_seq
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 35353 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 276.3 KiB |
Linked_nt
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 35353 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 276.3 KiB |
Linked_aa
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 35353 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 276.3 KiB |
Warnings/Errors
Text
| Distinct | 6219 |
|---|---|
| Distinct (%) | 17.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 3568 |
|---|---|
| Median length | 6 |
| Mean length | 404.7933132 |
| Min length | 6 |
Characters and Unicode
| Total characters | 14310658 |
|---|---|
| Distinct characters | 73 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 5429 |
|---|---|
| Unique (%) | 15.4% |
Sample
| 1st row | [None] |
|---|---|
| 2nd row | [None] |
| 3rd row | [None] |
| 4th row | [None] |
| 5th row | [None] |
| Value | Count | Frequency (%) |
| the | 187578 | 8.0% |
| for | 154645 | 6.6% |
| allele | 151076 | 6.4% |
| region | 121735 | 5.2% |
| 01 | 75403 | 3.2% |
| being | 75401 | 3.2% |
| trb | 68601 | 2.9% |
| tra | 62114 | 2.6% |
| is | 60004 | 2.6% |
| a | 51473 | 2.2% |
| Other values (411) | 1338065 | 57.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| (whitespace) | 2430202 | 17.0% |
| e | 1843482 | 12.9% |
| l | 942992 | 6.6% |
| i | 889883 | 6.2% |
| a | 739130 | 5.2% |
| r | 722217 | 5.0% |
| o | 653005 | 4.6% |
| t | 596039 | 4.2% |
| n | 585767 | 4.1% |
| d | 431554 | 3.0% |
| Other values (63) | 4476387 | 31.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 14310658 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 14310658 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 14310658 |
Epitope
Text
| Distinct | 1289 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 276.3 KiB |
Length
| Max length | 33 |
|---|---|
| Median length | 9 |
| Mean length | 9.49497921 |
| Min length | 7 |
Characters and Unicode
| Total characters | 335676 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 367 |
|---|---|
| Unique (%) | 1.0% |
Sample
| 1st row | FLKEKGGL |
|---|---|
| 2nd row | FLKEKGGL |
| 3rd row | FLKEKGGL |
| 4th row | FLKEQGGL |
| 5th row | FLKEQGGL |
| Value | Count | Frequency (%) |
| klggalqak | 13619 | 38.5% |
| gilgfvftl | 2489 | 7.0% |
| avfdrksdak | 1719 | 4.9% |
| rakfkqll | 1212 | 3.4% |
| tfeyvsqpflmdle | 976 | 2.8% |
| llwngpmav | 840 | 2.4% |
| ylqprtfll | 838 | 2.4% |
| sprwyfyyl | 793 | 2.2% |
| ttdpsflgry | 767 | 2.2% |
| ivtdfsvik | 718 | 2.0% |
| Other values (1279) | 11382 | 32.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| L | 57752 | 17.2% |
| A | 42150 | 12.6% |
| G | 40074 | 11.9% |
| K | 37975 | 11.3% |
| Q | 22033 | 6.6% |
| F | 18420 | 5.5% |
| V | 16621 | 5.0% |
| T | 14648 | 4.4% |
| P | 11504 | 3.4% |
| I | 10427 | 3.1% |
| Other values (10) | 64072 | 19.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 335676 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 335676 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 335676 |
Score
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 8583 |
| Missing (%) | 24.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1078446022 |
| Minimum | 0 |
| Maximum | 3 |
| Zeros | 25266 |
| Zeros (%) | 71.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 276.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 3 |
| Range | 3 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.4827191399 |
|---|---|
| Coefficient of variation (CV) | 4.476062132 |
| Kurtosis | 23.11594445 |
| Mean | 0.1078446022 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.819080067 |
| Sum | 2887 |
| Variance | 0.2330177681 |
| Monotonicity | Not monotonic |
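The descriptive statistics for Score can be recomputed from its four value counts. The sketch below is a rough check rather than the profiler's own code: it assumes missing rows are excluded and that the variance uses the sample (n − 1) denominator, which is the choice that reproduces the reported figure. All names are illustrative.

```python
import math

# Value counts for Score as reported (the 8583 missing rows are excluded).
counts = {0: 25266, 1: 569, 2: 487, 3: 448}

n = sum(counts.values())                      # non-missing observations
total = sum(v * c for v, c in counts.items()) # reported as "Sum"
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
ss = sum(c * (v - mean) ** 2 for v, c in counts.items())
variance = ss / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation is the standard deviation divided by the mean.
cv = std / mean

print(n, total)
print(round(mean, 10), round(variance, 10), round(std, 10), round(cv, 6))
```

A CV well above 1 is consistent with the table: most non-missing scores are 0, so the spread is large relative to the small mean.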
Common values
| Value | Count | Frequency (%) |
| 0 | 25266 | 71.5% |
| 1 | 569 | 1.6% |
| 2 | 487 | 1.4% |
| 3 | 448 | 1.3% |
| (Missing) | 8583 | 24.3% |
Minimum values
| Value | Count | Frequency (%) |
| 0 | 25266 | 71.5% |
| 1 | 569 | 1.6% |
| 2 | 487 | 1.4% |
| 3 | 448 | 1.3% |
Maximum values
| Value | Count | Frequency (%) |
| 3 | 448 | 1.3% |
| 2 | 487 | 1.4% |
| 1 | 569 | 1.6% |
| 0 | 25266 | 71.5% |
MHC A
Text
MISSING 
| Distinct | 98 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 1984 |
| Missing (%) | 5.6% |
| Memory size | 276.3 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 11 |
| Mean length | 11.01417483 |
| Min length | 6 |
Characters and Unicode
| Total characters | 367532 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 20 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | HLA-B*08 |
|---|---|
| 2nd row | HLA-B*08 |
| 3rd row | HLA-B*08 |
| 4th row | HLA-B*08 |
| 5th row | HLA-B*08 |
| Value | Count | Frequency (%) |
| hla-a*03:01 | 14287 | 42.8% |
| hla-a*02:01 | 8844 | 26.5% |
| hla-a*11:01 | 2489 | 7.5% |
| hla-a*01:01 | 1752 | 5.3% |
| hla-b*07:02 | 1584 | 4.7% |
| hla-b*08:01 | 1242 | 3.7% |
| hla-a*24:02 | 751 | 2.3% |
| hla-dqa1*05:01 | 345 | 1.0% |
| hla-a*02 | 247 | 0.7% |
| hla-b*15:01 | 234 | 0.7% |
| Other values (88) | 1594 | 4.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 62771 | 17.1% |
| 0 | 61827 | 16.8% |
| 1 | 38118 | 10.4% |
| H | 33369 | 9.1% |
| L | 33369 | 9.1% |
| - | 33369 | 9.1% |
| * | 33159 | 9.0% |
| : | 32982 | 9.0% |
| 3 | 14592 | 4.0% |
| 2 | 12609 | 3.4% |
| Other values (13) | 11367 | 3.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 367532 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 367532 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 367532 |
MHC B
Text
MISSING 
| Distinct | 51 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 32713 |
| Missing (%) | 92.5% |
| Memory size | 276.3 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 14 |
| Mean length | 12.78522727 |
| Min length | 8 |
Characters and Unicode
| Total characters | 33753 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 1 ? |
| Distinct scripts | 1 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 13 ? |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | HLA-DQB1*06:02 |
|---|---|
| 2nd row | HLA-DQB1*06:02 |
| 3rd row | HLA-DQB1*06:02 |
| 4th row | HLA-DRB1*15:03 |
| 5th row | HLA-DRB1*15:03 |
| Value | Count | Frequency (%) |
| hla-a*02:01 | 850 | 32.2% |
| hla-dpb1*04:01 | 594 | 22.5% |
| hla-dqb1*06:02 | 491 | 18.6% |
| hla-drb1*07:01 | 131 | 5.0% |
| hla-a*02 | 97 | 3.7% |
| hla-dqb1*02:01 | 86 | 3.3% |
| hla-drb1*04:01 | 81 | 3.1% |
| hla-drb1*15:01 | 57 | 2.2% |
| hla-drb3*03:01 | 38 | 1.4% |
| hla-a*24:02 | 32 | 1.2% |
| Other values (41) | 183 | 6.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5048 | 15.0% |
| 1 | 3711 | 11.0% |
| A | 3628 | 10.7% |
| H | 2640 | 7.8% |
| L | 2640 | 7.8% |
| - | 2640 | 7.8% |
| * | 2640 | 7.8% |
| : | 2577 | 7.6% |
| 2 | 1662 | 4.9% |
| B | 1648 | 4.9% |
| Other values (12) | 4919 | 14.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 33753 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 33753 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 33753 |
MHC class
Text
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 8 |
| Missing (%) | < 0.1% |
| Memory size | 276.3 KiB |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.077945961 |
| Min length | 4 |
Characters and Unicode
| Total characters | 144135 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | MHCI |
|---|---|
| 2nd row | MHCI |
| 3rd row | MHCI |
| 4th row | MHCI |
| 5th row | MHCI |
| Value | Count | Frequency (%) |
| mhci | 32590 | 92.2% |
| mhcii | 2755 | 7.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| I | 38100 | 26.4% |
| M | 35345 | 24.5% |
| H | 35345 | 24.5% |
| C | 35345 | 24.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 144135 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 144135 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 144135 |
Binding
Real number (ℝ)
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1 |
| Minimum | 1 |
| Maximum | 1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 276.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 0 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0 |
|---|---|
| Coefficient of variation (CV) | 0 |
| Kurtosis | 0 |
| Mean | 1 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0 |
| Sum | 35353 |
| Variance | 0 |
| Monotonicity | Increasing |
Common values
| Value | Count | Frequency (%) |
| 1 | 35353 | 100.0% |
Minimum values
| Value | Count | Frequency (%) |
| 1 | 35353 | 100.0% |
Maximum values
| Value | Count | Frequency (%) |
| 1 | 35353 | 100.0% |